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Abstract. We present the first algorithm for generating random variates exactly uniformly from 
the set of perfect matchings of a bipartite graph with a polynomial expected running time over a 
nontrivial set of graphs. Previous Markov chain results obtain approximately uniform variates for 
arbitrary graphs in polynomial time, but their general running time is 0(n 26 (lnn) 2 ). Our algorithm 
employs acceptance/rejection together with a new upper limit on the permanent of a form similar 
to Bregman's Theorem. For a graph with 2n nodes where the degree of every node is nearly 
7n for a constant 7, the expected running time is 0(n 1,5+,5//7 ). Under these conditions, Jerrum 
and Sinclair showed that a Markov chain of Broder can generate approximately uniform variates 
in Q(n 4 " 5+ ' 5 ^ 1 In n) time, making our algorithm significantly faster on this class of graph. With 
our approach, approximately counting the number of perfect matchings (equivalent to finding the 
permanent of a 0-1 matrix and so JP complete) can be done without use of selfreducibility. We give 
an 1 + a approximation algorithm for finding the permanent of 0-1 matrices with nearly identical 
row and column sums that runs in time 0(n 15+ ' 5 ^ 7 ^j- log |), where the probability the output is 
within 1 + a of the permanent is at least 1 — 5. 

Key Words. Approximation algorithms, perfect matchings, perfect sampling, permanent, Breg- 
man's theorem, Mine's conjecture 

1. The Problem. Given a bipartite graph (V, E) with n nodes in each of the two parti- 
tions (so \V\ = 2n), a perfect matching is a subset of edges such that each node is adjacent 
to exactly one edge. Let fi denote the set of all such perfect matchings. Generating random 
variates uniformly from D has a variety of applications, such as determining the p values 
of tests for doubly truncated data For general graphs, Jerrum, Sinclair, and Vigoda [5] 
gave the first polynomial time algorithm for generating samples that are arbitrarily close to 
uniform, but their method requires 6(n 26 (lnn) 2 ) steps. 

Generation algorithms can also be used in constructing algorithms for approximately 
counting Q. This problem is equivalent to finding the permanent of a 0-1 matrix. Computing 
the permanent of a matrix was one of the first problems shown to be (JP complete |T3]- When 
restricted to 0-1 matrices where the row and column sums are at least n/2, the problem 
remains jjP complete. We restrict further the problem to the case where each edge has 
at least 717 edges and the graph is nearly regular. Even under all these restrictions, the 
problem remains jjP complete 0. 

Algorithms for approximating the permanent have been constructed using a variety of 
techniques, including decomposition of the problem jlOj and using determinants of related 
random matrices pQ . Using selfreducibility of the problem jHj , any polynomial time technique 
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for generating samples can be used to efficiently approximate |0|, but the number of samples 
needed can be fairly large. The Jerrum, Sinclair, and Vigoda method constructs such an 
estimate along with a means for generating samples simultaneously-the first sample requires 
the most time, making the approximation algorithm 0(r£ 26 (lnn) 2 ) as well. 

For bipartite graphs with minimum degree n/2, the Markov chain of Broder 3 requires 
0(n 5 Inn) time to generate approximate samples [5], and will more generally be polynomial 
when the ratio of the number of almost matchings (matchings with n — 1 edges) to the 
number of perfect matchings is polynomial [ 1 3 j . 

Given the 0(n 26 (lnn) 2 ) technique for generating approximate samples, two questions 
arise. First, can this running time be significantly reduced by restricting the class of graphs 
we consider? Second, can samples be drawn exactly from the uniform distribution, as 
opposed to the approximate samples created by Markov chains? 

An exact sampler is preferable to approximate samplers for several reasons. A desirable 
property of stochastic algorithms is that they be unbiased, that the expected value of the 
output be equal to the actual output. Finding unbiased algorithms for applications such as 
estimating p values relies on the output coming exactly from the desired distribution. A 
more important reason in practice is that algorithms coming from Markov chains are not 
only 0(T), they are 0(T) where T is the mixing time of the chain. That is, T steps must 
always be taken in order to insure that the output is close to the desired distribution. On 
the other hand, exact sampling algorithms are often truly O(T'), where T' is merely an 
upper bound on the running time that in practice is never reached. 

In the remainder of this paper, we present an exact sampling technique that is guaranteed 
to run in polynomial time over a nontrivial set of graphs. Roughly speaking, when the degree 
of each node is close to jn, the algorithm takes time O(n 1 ' 5+0 ' 5 ' 7 ). 

Also, this method naturally gives an algorithm for approximately counting the number 
of perfect matchings without the need to take advantage of selfreducibility. This makes 
implementation easier, and also makes the running time similar to that needed to draw a 
sample: the running time is 0(S-^g log fij) time, where S is the expected time needed to 
obtain a single random variate. 

The permanent of a matrix A is defined as 

n 

per(A) := ^ II a »^)- 

7reS„ i=l 

When the matrix is 0-1 the terms of the permanent are or 1. They are nonzero precisely 
when {{1, 7r(l)}, . . . , {n, 7r(n)}} is a perfect matching in the bipartite graph where there is 
an edge from the ith node of one partition to the jth node of the other partition if and only 
if ay = 1. This makes computing the permanent of a 0-1 matrix equivalent to counting the 
number of perfect matchings. 

We will only be able to draw from perfect matchings where the graph is nearly regular. 
Although our method can be run on instances which are far from being regular and will 
generate an exact sample when it terminates, it is not guaranteed to terminate in polynomial 
time unless the graph is nearly regular with approximate degree A that is at least 771. 



3 



1.1 Bregman's Theorem. The heart of our work is a new inequality for permanents that 
is of the same form but slightly weaker than Bregman's Theorem. Although Bregman's 
theorem is stronger, it cannot be used algorithmically to generate samples, whereas our new 
inequality can be used in such a fashion. 

First conjectured by Mine an d proved by Bregman Bregman's Theorem gives 
the following upper bound on the permanent: 

n 

M(A) := Y[(r{i)l) 1/r(l) > per{A), (1) 
i=i 

where r(i) is the sum of the elements of the «th row of A. 

Rather than using the Bregman factors a! 1 /", we will use the following factors, defined 
recursively in the following fashion: 

5 (1) := e, g(a + 1) = g(a) + 1 + + £L Va > 1. (2) 



Lemma 1 For all 0-1 matrices A, with row sums r 

M(A) = 17 > per ( A ). (3) 

■ L - L e 

The factors in J2J are fairly close to the factors in Bregman's theorem. Using Stirling's 
formula, the ratio of the two can be shown to converge to 1. 
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Van der Waerden's Inequality. To lower bound the number of perfect matchings, we will 
use Van der Waerden's Inequality , which says that any doubly stochastic n by n matrix 
has permanent at least n\/n n . 

Suppose that our graph is regular, or equivalently, that all the row and column sums of 
the matrix A are A. Then dividing each row by A leaves us with a new doubly stochastic 
matrix A' with per(A') — A~ n per(A). Applying Van der Waerden's inequality gives us 
per(A) > A n n\/n n . Using Stirling's Inequality we have that n! > n ,l e _,l v / 27m, so our 
lower bound becomes (A/e) n y/2nn. This differs from the upper bound by a factor that is 
polynomial when A is of the same order as n. 

2. The algorithm. The basic form of the algorithm is not new, it is simply acceptance 
rejection. What is new is the form of M(A) that allows us to use the selfreducibility of 
the permanent problem together with the acceptance/rejection protocol. We will choose a 
number uniformly between 1 and M(A) inclusive. If this number corresponds to a valid 
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Figure 1 : Example of reduced matrices 

perfect matching, we accept, otherwise we reject and must begin again. We will build up 
the perfect matching (permutation) corresponding to our random number one entry at a 
time. That is, we move through the columns one at a time, choosing a different row to go 
with each column. 

Let a denote our permutation. When choosing row a(i) to go with column i, update the 
matrix by zeroing out column i and row cr{i), except for the entry A(a(i), i) that remains 1. 
This leaves a smaller version of our original problem. Let f(A,i,j) denote this new matrix. 

Consider the example of Figure l2~l For the first column, we can choose rows 1, 2, or 
3 to add to our permutation. Each choice leaves a reduced matrix f(A, 1, 1), f(A, 2, 1) 
or f(A, 3,1) from which to choose the remaining permutation. If our random number 
from 1 to M(A) lies in {1, . . . , M(f(A, 1, 1))}, we set a(l) = 1. If the number lies in 
{M(f(A, 1, 1)) + 1, ... , M(f(A, 1, 1)) + M(f (A, 2, 1))} we set a(l) = 2, and so on. The 
probability of selecting a particular permutation will be 1/M(A), and so conditioned on 
acceptance the algorithm chooses a permutation with probability l/per(A). 

In practice the user will not choose at the beginning of the algorithm a number from 1 to 
M(A). Instead, the user assigns row a(j) to column j with probability M(f(A, a(j),j))/M(A) 
at each step. The probability of ending at a particular permutation a will then be: 

M(/(A,tx(l),l)) A7(/(/Q4,<7(2),2))) M(f(f(---f{A,a(n),n))) _ 1 

M(A) M(f(A,a(l),l)) M(f(f(---f(A,a(n-l),n-l)) M(A) ' 

Pseudocode for this procedure is given in Figure |2~1 

Important note: to have a valid algorithm we need Y^i:A(i,j)=i ^(/(A h j))/M{A) < 1. 
Otherwise the factors in the product in @ will not be probabilities. In fact, this is the 
very reason we use M(A) rather than M(A). Suppose that A is a 5 by 5 matrix whose 
row sums are all 4 and first column is all ones. Then Y^i-AU j)=i -^(/(A h 1)) = 54.5, but 
M(A) = 53.1. Since 54.5/53.1 > 1, the algorithm would fail at this point. However, for this 
example Yli-.AU j)=i -^(/(A h 1)) = 64.3, and M(A) — 65.1, and so our algorithm proceeds 
without difficulty In Section^we show in general that J2tA(i j)=i ^(/(A h 1)) — M(A). 

2.1 Estimating the permanent. Jerrum, Valiant, and Vazirani showed how to turn a sam- 
pling algorithm into an approximate counting algorithm for selfreducible problems [5], but 
we will not need to employ their method here. 
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Generate Random Perfect Matching 




Input: Square 0-1 n x n matrix A 




Output: Perfect matching it. 




Repeat 




Let ACCEPT «- TRUE, Let A*— A, Let r be the row sums 


of A 


For j from 1 to n 




If f(i) = for any i or there exists i ^ i' such that A(i,j) = 


A(i',j)=r{i) = r{i') = 1 


Let ACCEPT <- FALSE 




Else 




Choose R at random from {1, . . . , n + 1} using 




P(iJ = t) = [M(/(i,z,i))/M(i)]l(I(i,i) =1) for 1 <i 


< n 


P(R = n + l) = l- E l p (R = i ) 




Ul<R<n 




Let 7r(j) <- i? 




Let A /(A, R, j), Let r be the row sums of A 




Else 




Let ACCEPT <- FALSE 




Until ACCEPT = TRUE 





Figure 2: Generate a perfect matching 

Instead, we simply note that per(A)/M(A) is exactly the probability that the algorithm 
creates a valid permutation on any given run through the columns. M(A) is of course easy 
to compute in 0(n\og \M(A)\) time, we then just keep track of the number of acceptances 
over the number of attempts, and multiplying by M(A) gives an approximation to per(A). 
Standard Chernoff bounds 0] show that after a~ 2 log(l/<5) samples, the estimate will be 
correct to within a factor of 1 + a with probability at least 1 — 5. 

3. Analyzing the algorithm. Because we are just sampling uniformly from the num- 
bers 1, . . . , M(A), if our algorithm succeeds, it reaches any particular permutation with 
probability 1/M(A), and conditional on reaching a valid permutation (by which we mean 
one that corresponds to a perfect matching), the probability of hitting a particular permu- 
tation is just [l/M(A)]/[per(A)/M (A)] = l/per(A) exactly as desired. 

The only remaining question is whether or not each step in the algorithm can be suc- 
cessfully performed. 
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Lemma 2 For all 0-1 n by n matrices A, and j G {1, . . . , n}, 

n 

~]A(i,j)M(f(A,i,j))<M(A). 



i=l 



(5) 



Begin the proof of Lemma|21by noting that M(f(A, i,j)) and M(A) each contain many 
of the same (positive) terms that can be canceled on both sides of the equation. Without 
loss of generality, assume that elements A(l,j) through A(cj,j) are 1 and the remaining 
elements of column j are zero. Then when we consider M(f(A,i,j)) for i < Cj, the i term 
goes from g(ji) down to <?(1), Cj — 1 row sums of A are reduced by 1. When i > Cj, row i is 
the same in f(A, i,j) as in A, and so the row sum is also the same. Hence 



M(f(A,i,j)) 
M(A) 

Therefore we have 

EjA(iJ)M(f(A,i,j)) 
M(A) 



g(r(k) - 1) 



k<Cj 



g(r(k) - 1) 
g(r(k)) ■ 



n 

Lfc=l 



9(r(k) - 1) 
g(r(k)) 



E 



i 9(r(k) - 1) 



(6) 



and we are interested in showing that this expression is bounded above by 1. We can rewrite 
the RHS by denoting g(r(k) — 1) = a^, and using g(r(k)) = a\- + 1 + 0.5/a^ + 0.2/ajj. Set 



s(a k ) 
p(a k ) 



I/a* 

1/[1 + I/a*: + 0.5/a^ + .6/al 



.k=l 



(Ik , 



(7) 

(8) 

(9) 



When a is the vector of row sums for the first Cj rows of A, then h(a) equals the RHS of 
®), and so our goal is to show that h(a) is bounded above by 1. 

In fact, we will show that /i(a) < 1 for all a such that g(l) < a k < g(n — 1) for all k. 
That is we allow a to vary continuously over the region rather than restricting it to the 
finite set of values {5(1), • ■ ■ ,g(n — 1)}- Since the region is compact and h continuous, h 
attains its maximum at a particular point a*. 

Suppose that we evaluate h at a vector a so that a k < ai for some k 7^ i. Consider the 
factors and terms of h that depend on a k and ai . Given a, construct a new vector a where 
ak — a and ag is chosen so that 



s(dfc) + s(ae) = s(a) + s(ae). 



(10) 



Let hk,t{p,) — /i(a). Then we can write hk,i(fl) — Cp(a)p(ai), where C is a constant. Differ- 
entiating, we find 

h 'kA a ) = C \p' ( a )p(">t) - p(a)p(ae)s'(a)/s'(ai)]. 
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Since s'(x) < 0, p(x) > 0, and p'(x) > over the region we are interested in, sgn(h' k e (a)) = 
sgn(p'(a)/\p(a)(—s'(a))] — p' (at)/[p(a,t){—s' (at))]. Note h' k e (a) has a zero at a = at- Con- 
sider the function t(x) — p'(x)/\p(x)(—s'(x))]. Using JJJ and © and simplifying, we find 
that t(x) = 1 + [1.3a; — .6]/[a; 3 + x 2 + .5x + .6]. Since x > g(l) = e the numerator is positive, 
and the denominator is growing faster than the numerator so t(x) is a decreasing function 
over our region of interest. This means that h' k ^(a) is exactly when a = at, is negative 
when a < at, and positive when a > at. 

Hence hk,t(a) has its unique maximum at a — at . Since this is true for any k and £ with 
o-k < o-t, all the components of a* are identical. 

So instead of maximizing h(a), we can maximize 

h(a) :=ep(ap^-. (11) 
a 

To accomplish this we will need two facts about exponentials that are easily verified by 
taking the appropriate derivatives. 

(Vx > 0)(1 -x< exp{-(a; + 0.5x 2 )}). (12) 

(VyeR)(ycxp{-y}<exp{-l}). (13) 

Let 5(a) := [l/a + .5/a 2 + .6/a 3 ]/[l + 1/a + .5/a 2 + .6/a 3 ], so p{a) = 1 - S(a). Using 
JT^Jl and l(T^|) , we have that 

h(a) < -cj exp{~c :) ((5(a) + .56(a) 2 )} 
1 

" a(S{a) + .56(a) 2 ) 

We can write l/[a(<5(a) + .5<5(a) 2 )] asp\(a)/p2(a), where pi andp2 are degree six polynomials 
positive for positive a. Furthermore, ^2(0) — Pi(a) = -2a 4 — .05a 3 + .35a 2 + 1.2a— 3.6 which 
is positive for a > g(l), and so p\(a)/p2(a) = 1 — [^2(0) — Pi( a )]/P2(o) < 1- Hence h(a) < 1, 
and Lemma |2 is proved. 

3.1 Running Time. In this section we derive an upper bound on the expected running 
time of the algorithm that is polynomial under certain conditions. Even if the graph does 
not meet these conditions, the algorithm can be used to generate perfect matchings. We 
only lack a priori bounds on what the running time is. 

The time spent inside the repeat loop is easy to compute. Computing the ratios 
M(f(A,i,j))/M(A) takes time 0(n). Choosing R takes only 0(n) time and O(logn) ex- 
pected random bits. Marking the permutation and changing row R and column j to also 
takes 0(n) time. The loop is run n times, so altogether 0(n 2 ) work is needed. 

The only question that remains is how often does the algorithm accept, that is, how 
small is per(A) / M(A)1 The expected amount of samples taken before acceptance will be 
M (A) /per (A). We noted in section^that Van der Waerden's inequality together with 
Stirling's formula may be used to show that per (A) > (v / 2~7m)(A/e)™. 
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Since g(a + l) > g(a) + l, wc know that g(a) > a, and so g(a + l)~ g(a) < l + .b/a + .Q/a 2 . 
Given this fact, it is straightforward to show inductively that for a > 2, 

g(a) < a + .51na + 1.65. 

With each row sum identically A, M(A) < ([A + .5 In A + 1.65]/e) n . Using 1 + x < e x , 

M(A) 1 / In A 1.65\" 1 , r- n „ /A 
per(A) V A A J V2™ 

We have proved the following 

Theorem 1 The expected running time needed to obtain a sample is 

O (V- 5 A' 5 "/ A 5.3 n/A ) (14) 
In particular, if A — for some constant 7, then the expected running time is 

( n 1.5+.5/ 7 ^ 

Running time using Markov chains. Jerrum and Sinclair jj] showed that the Markov chain 
of Broder jS| requires 0(n 2 \E\Fhin) time to generate an approximate sample, where \E\ is 
the number of edges in the graph and F is the ratio of almost matchings (with n — 1 edges) 
to perfect matchings. 

One way to count the number of almost matchings is just to choose which node on the 
left is unmatched (there are n ways to do this) and then use our algorithm to complete 
the almost matching as before. This shows that the number of almost matchings is at most 
nM(A), hence the ratio of almost matchings to perfect matchings is at most nM(r)/per(A), 
and the total time needed by the Broder chain will be worse than our algorithm by a factor 
of 0(n 3 In n). 

3.2 Estimating the Permanent. One form of Chcrnoff's Bound 0] is the following: 

Theorem 2 ChernofT's Bound Let Xx, ■ ■ ■ ,X t foe 0, 1 i.i.d. Bernoulli random variables 
with parameter p. Then for a < 1 



P 



In our algorithm, p = per(A)/ M(r). Hence after 0([M(r)/per(A)]\og(l/5)/a 2 ) steps, 
the algorithm will come within 1 + a of the true answer with probability at least 1 — 8. 

Theorem 3 The expected running time needed to obtain a 1 + a approximation with prob- 
ability at least 1 — 5 is 

O (n 1 - 5+ - 5 / nog{l / 5) /a 2 ^j . 
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4. Nearly Regular Graphs. Until now, we have assumed that the graph was regular 
so that each node has the same degree. We now show how to relax this assumption to deal 
with graphs that are merely close to being regular. 

In obtaining our upper bound, we did not utilize the regularity assumption at all, so the 
only time where it comes into play is in the lower bound. Van der Waerden's Inequality is 
easily extended to the case where all the row and column sums are 1 or larger. 

Suppose that when we normalize the row sums by dividing by the degree, the column 
sums are not exactly 1. For each column with sum less than 1, we normalize it by dividing 
by its column sum. Then all column and row sums are at least 1, and Van der Waerden's 
Inequality assures us that the permanent will be at least n\/n n . Taking account of the 
normalization, the original permanent will be at least (n\/n n ) Yl c <i c i- 

So if one or two node have degrees that are different from the rest by a factor of 2, the 
running time of this approach is unchanged. If all the degrees are in the set {A— c, . . . , A+c} 
for some constant c, then each c\ > (A — c)/(A + c), and their product (for sufficiently large 
n will be at least e - 2cn / A . When n/A is a constant, this is also a constant so the running 
time of the procedure will be of the same order as before. A similar argument shows that 
even if the degrees of all of the nodes vary by O(lnn), the running time of the procedure 
will still be polynomial. 

It can be shown that any bipartite perfect matching problem may be efficiently reduced 
to a perfect matching problem where the degree of the edges is either 2 or 3. Using a 
technique of Broder [3] we can then make the problem dense, and the degree of each edge 
will still only differ by 1, making it nearly regular. Hence the nearly regular problems we 
are approximating in this section are still jjP complete. 

It remains to be seen if this approach of sampling from Mine as an upper bound can be 
proven efficient in situations where the degrees are not close to being regular. The denseness 
assumption might make it possible to obtain a better lower bound than that given by Van 
der Waerden. 

Our algorithm runs quickly on any matrix where M(r)/per(A) is polynomial. The 
Jerrum Sinclair results show that the Markov chain approach runs quickly when the ratio 
of (n-l)-matchings (matchings with only n-1 edges) to perfect matchings is polynomial, and 
it would of course be useful to be to make precise the connection between these two ratios. 
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